code from Tutorial just not working

by: JGSmith123, 8 years ago


I have been following your pandas tutorial. (pretty amazing stuff - still a bit over my head, but I am tall so I am sure I will grab it soon enough ;) )

I am trying to join the data frames for the states from Quandl. Everything has worked thus far, though I have had to make a few changes to what you wrote. One example is that Python can no longer import "Quandl". The package name must have changed since your video, because now only the lowercase "quandl" works.

Other than that, I have tried to stay extremely close to your code.

But I keep running into an error on


else:
    main_df = main_df.join(df)


I keep getting an error on this line saying:

ValueError: columns overlap but no suffix specified: Index(['Value'], dtype='object')

Ultimately I just copied and pasted your code into my IDE and I still get the same issue.

Here is a copy of my entire code.


import quandl
import pandas as pd

api_key = open('quandl_api','r').read()
states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

main_df = pd.DataFrame()

for abbv in states[0][0][1:]:
    query = "FMAC/HPI_"+str(abbv)
    df = quandl.get(query, authtoken=api_key)

    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)



Do you have an idea of what I am doing wrong? I really appreciate your help!






I have continued and "cleaned up" the code as you mention in the pickles video (#7 I believe). It is printing for AL and AK but then I only get errors from there.

First I am posting my code and then the result with the errors:


import quandl
import pandas as pd
import pickle

api_key = open('quandl_api','r').read()

def state_list():
    fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
    return fiddy_states[0][0][1:]

def grab_initial_state_data():
    states = state_list()
    
    main_df = pd.DataFrame()

    for abbv in states:
        query = "FMAC/HPI_"+str(abbv)
        df = quandl.get(query, authtoken=api_key)
        print(query)
        if main_df.empty:
            main_df = df
        else:
            main_df = main_df.join(df)
    
    print(main_df.head())
    pickle_out = open("fiddy_states.pickle","wb")
    pickle.dump(main_df, pickle_out)
    pickle_out.close()

grab_initial_state_data()


And here is the output

FMAC/HPI_AL
FMAC/HPI_AK
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-77aedc136fa5> in <module>()
     28     pickle_out.close()
     29
---> 30 grab_initial_state_data()

<ipython-input-8-77aedc136fa5> in grab_initial_state_data()
     21             main_df = df
     22         else:
---> 23             main_df = main_df.join(df)
     24
     25     print(main_df.head())

C:\Users\Tradi\Anaconda3\lib\site-packages\pandas\core\frame.py in join(self, other, on, how, lsuffix, rsuffix, sort)
   4367         # For SparseDataFrame's benefit
   4368         return self._join_compat(other, on=on, how=how, lsuffix=lsuffix,
-> 4369                                  rsuffix=rsuffix, sort=sort)
   4370
   4371     def _join_compat(self, other, on=None, how='left', lsuffix='', rsuffix='',

C:\Users\Tradi\Anaconda3\lib\site-packages\pandas\core\frame.py in _join_compat(self, other, on, how, lsuffix, rsuffix, sort)
   4381             return merge(self, other, left_on=on, how=how,
   4382                          left_index=on is None, right_index=True,
-> 4383                          suffixes=(lsuffix, rsuffix), sort=sort)
   4384         else:
   4385             if on is not None:

C:\Users\Tradi\Anaconda3\lib\site-packages\pandas\tools\merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator)
     33                          right_index=right_index, sort=sort, suffixes=suffixes,
     34                          copy=copy, indicator=indicator)
---> 35     return op.get_result()
     36 if __debug__:
     37     merge.__doc__ = _merge_doc % 'nleft : DataFrame'

C:\Users\Tradi\Anaconda3\lib\site-packages\pandas\tools\merge.py in get_result(self)
    210
    211         llabels, rlabels = items_overlap_with_suffix(ldata.items, lsuf,
--> 212                                                      rdata.items, rsuf)
    213
    214         lindexers = {1: left_indexer} if left_indexer is not None else {}

C:\Users\Tradi\Anaconda3\lib\site-packages\pandas\core\internals.py in items_overlap_with_suffix(left, lsuffix, right, rsuffix)
   4372         if not lsuffix and not rsuffix:
   4373             raise ValueError('columns overlap but no suffix specified: %s' %
-> 4374                              to_rename)
   4375
   4376         def lrenamer(x):

ValueError: columns overlap but no suffix specified: Index(['Value'], dtype='object')

-JGSmith123 8 years ago



In this case it looks like pandas is claiming you have two columns with the exact same name. They are probably both called Value. If you have overlapping columns in a pandas join, you need to specify an lsuffix or rsuffix in the parameters. I think that since I posted the tutorial, Quandl changed the name of the column to just Value.
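
To make that concrete, here is a rough sketch of the two usual ways around it. It uses small made-up frames instead of real Quandl data (the dates and numbers are placeholders, and the `frames` dict is just a stand-in for the per-state downloads), so treat it as an illustration rather than the tutorial's exact code. The first option passes lsuffix/rsuffix to join; the second renames each frame's Value column to the state abbreviation before joining, which is probably what you want in the loop.

```python
import pandas as pd

# Stand-in frames shaped like two per-state results: same date index and
# the same single column name, "Value". The numbers are made up.
idx = pd.to_datetime(["1975-01-31", "1975-02-28"])
al = pd.DataFrame({"Value": [1.0, 2.0]}, index=idx)
ak = pd.DataFrame({"Value": [3.0, 4.0]}, index=idx)

# Joining as-is reproduces the error, because both frames have a "Value" column.
try:
    al.join(ak)
    overlap_error = None
except ValueError as err:
    overlap_error = str(err)

# Option 1: let join disambiguate the overlapping names with suffixes.
suffixed = al.join(ak, lsuffix="_AL", rsuffix="_AK")

# Option 2: rename "Value" to the state abbreviation before joining,
# so no two frames ever share a column name.
frames = {"AL": al, "AK": ak}  # hypothetical stand-in for the downloaded data
main_df = pd.DataFrame()
for abbv, df in frames.items():
    df = df.rename(columns={"Value": abbv})
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)

print(overlap_error)
print(list(suffixed.columns))
print(list(main_df.columns))
```

In the real loop, the rename would go right after the quandl.get call, using the same abbv variable. The suffix approach works for two frames but gets awkward across fifty states, since the left frame's columns only pick up a suffix when they actually clash.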

-Harrison 8 years ago



Thank you very much for your reply on this Harrison. I am learning a lot from you.

I spent some time trying to work this out myself after your reply, but I am just not getting it. Can you please direct me to where I can learn about the parameters that you are referring to? I have watched your #7 tutorial video again to see if you talk about them there, but you don't seem to.

Thank you in advance for your great help!

-JGSmith123 8 years ago



Also, would it be possible to set up an email notification for when we get a reply to our topic? That would be really great!

-JGSmith123 8 years ago
